REESE: A Method of Soft Error Detection in Microprocessors
Abstract
Future reliability of general-purpose processors (GPPs) is threatened by a combination of shrinking transistor size, higher clock rates, reduced supply voltages, and other factors. It is predicted that the occurrence of arbitrary transient faults, or soft errors, will dramatically increase as these trends continue. In this paper, we develop and evaluate a fault-tolerant microprocessor architecture that detects soft errors in its own data pipeline. This architecture accomplishes soft error detection through time redundancy, while requiring little execution time overhead. Our approach, called REESE (REdundant Execution using Spare Elements), first minimizes this overhead and then decreases it even further by strategically adding a small number of functional units to the pipeline. This differs from similar approaches in the past that have not addressed ways of reducing the overhead necessary to implement time redundancy in GPPs.
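The general idea of time redundancy mentioned in the abstract can be illustrated with a small software sketch. This is not the REESE hardware design, which the abstract does not detail; it only assumes that an operation is executed twice at different times and the two results are compared, so that a transient fault affecting one execution shows up as a mismatch. The names `flip_bit`, `execute_with_time_redundancy`, and `fault_bit` are hypothetical and chosen for illustration only.

```python
import operator
from typing import Callable, Optional, Tuple


def flip_bit(value: int, bit: int) -> int:
    """Model a transient fault by flipping a single bit of a result."""
    return value ^ (1 << bit)


def execute_with_time_redundancy(
    op: Callable[[int, int], int],
    a: int,
    b: int,
    fault_bit: Optional[int] = None,
) -> Tuple[int, bool]:
    """Run `op` twice and compare; return (primary_result, error_detected).

    `fault_bit` (hypothetical test hook) corrupts only the first execution,
    mimicking a soft error that does not repeat on re-execution.
    """
    primary = op(a, b)
    if fault_bit is not None:
        primary = flip_bit(primary, fault_bit)  # injected soft error
    redundant = op(a, b)  # second execution, later in time
    return primary, primary != redundant


# Fault-free run: both executions agree, so nothing is flagged.
print(execute_with_time_redundancy(operator.add, 2, 3))               # (5, False)
# Bit 1 of the first result is flipped (5 -> 7); the mismatch is detected.
print(execute_with_time_redundancy(operator.add, 2, 3, fault_bit=1))  # (7, True)
```

In this sketch every operation is executed twice, which roughly doubles the work. The abstract's stated goal is to keep that execution time overhead small, first by minimizing it and then by adding a small number of functional units to the pipeline.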
Similar resources
Oware: Operand width Aware Redundant Execution for Whole-Processor Error Detection
As the feature size of semiconductor technology continues to shrink, high-performance microprocessors are increasingly susceptible to soft errors. Exploiting the fact that narrow-width values universally exist in applications, prior in-register duplication approaches for improving reliability of register file and other data-holding components mitigate performance cost but leave the rest of data...
Soft error tolerant Content Addressable Memories (CAMs) using error detection codes and duplication
Soft Errors are becoming a major concern for modern computing systems. Memories are one of the elements affected by soft errors, which cause bitflips in some of the cells. A number of techniques such as the use of Error Correction Codes (ECCs), interleaving or scrubbing are utilized to mitigate the effects of soft errors on memories. Content Addressable Memories (CAMs) pose additional challenge...
A Fault Detection Method for Combinational Circuits
As transistors become increasingly smaller and faster and noise margins become tighter, circuits and chips, especially microprocessors, tend to become more vulnerable to permanent and transient hardware faults. Most microprocessor designers focus on protecting memory elements among other parts of microprocessors against hardware faults through adding redundant error-correcting bits such as parity b...
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one such code. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...
Checker Backend for Soft and Timing Error Detection and Recovery
Current microprocessors are becoming more vulnerable to cosmic particle strikes and parameter variations. Particle strikes may cause soft (transient) errors, whereas high variability (due to process, temperature and voltage) may transform non-critical paths into critical paths, resulting in timing errors. This paper proposes a design that exploits the benefits of clustering for detecting and re...